Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added a draft for LAI #4771

Merged
merged 5 commits into from
Feb 25, 2024
Merged

Added a draft for LAI #4771

merged 5 commits into from
Feb 25, 2024

Conversation

GallVp
Copy link
Member

@GallVp GallVp commented Jan 18, 2024

PR checklist

Closes #XXX

  • Added a draft for LAI
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the module conventions in the contribution docs
  • If necessary, include test data in your PR.
  • Remove all TODO statements.
  • Emit the versions.yml file.
  • Follow the naming conventions.
  • Follow the parameters requirements.
  • Follow the input/output options guidelines.
  • Add a resource label
  • Use BioConda and BioContainers if possible to fulfil software requirements.
  • Ensure that the test works with either Docker / Singularity. Conda CI tests can be quite flaky:
    • PROFILE=docker pytest --tag <MODULE> --symlink --keep-workflow-wd --git-aware
    • PROFILE=singularity pytest --tag <MODULE> --symlink --keep-workflow-wd --git-aware
    • PROFILE=conda pytest --tag <MODULE> --symlink --keep-workflow-wd --git-aware

@GallVp GallVp requested a review from a team as a code owner January 18, 2024 23:42
@GallVp GallVp requested a review from SPPearce January 18, 2024 23:42
@GallVp GallVp added the WIP Work in progress label Jan 18, 2024
@GallVp
Copy link
Member Author

GallVp commented Jan 18, 2024

Please do not merge yet. I need to add test data. The current genome files are either too small or have been soft masked where as this tool works on the repeat space and requires an unmasked genome.

@mahesh-panchal mahesh-panchal marked this pull request as draft January 19, 2024 08:26
Copy link
Member

@mahesh-panchal mahesh-panchal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Converted to draft. When you're ready for reviewing for merge, click ready for review.

sed \\
'/^>/ s/\\s.*\$//' \\
$fasta \\
> for_lai_no_comments.fsa
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For reproducibility/output provanence, add this file as an optional output, and use an input boolean to say whether this file is kept or rm'ed in the script

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have removed this logic entirely as it was a partial solution. We need a separate module to format a file for LAI as it also requires alpha numeric fasta ids of 13 characters max. I have implemented a module which creates a mapping and a separate module which does reverse mapping. I'll upload those as part of the FASTA_LTRRETRIEVER_LAI workflow.

- bioconda
- defaults
dependencies:
- "bioconda::LTR_retriever=2.9.0"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The placing of these modules is getting confusing. The package names and the tools inside them are not really findable with what we've been doing lately. I'm starting to think we need to enforce that the tool should be named after the package it's in followed by the tool name. I.e. in this case we get LTRRETRIEVER_LAI.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I agree. Let's move it to LTRRETRIEVER_LAI because historically LAI never had its own GitHub repo or a Bioconda recipe. It is a script which has its own publication but can only be applied to the outputs from LTRRETRIEVER.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now the module is LTRRETRIEVER_LAI

mv \\
$lai_output_name \\
"${prefix}.LAI.out" \\
|| echo "LAI did not produce the output file"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
|| echo "LAI did not produce the output file"
|| echo "No LTR annotations were found by RepeatMasker."

I think it would be better to echo what the implication is. I'm not sure my suggestion is correct, but please state what no output implies.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have updated this to echo "LAI failed to estimate assembly index. See ${prefix}.LAI.log"

def args = task.ext.args ?: ''
def prefix = task.ext.prefix ?: "${meta.id}"
def monoploid_param = monoploid_seqs ? "-mono $monoploid_seqs" : ''
def lai_output_name = monoploid_seqs ? "${annotation_out}.${monoploid_seqs}.out.LAI" : "${annotation_out}.LAI"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When monoploid_seqs is [] this name is going to be ${annotation_out}..out.LAI (i.e. with the double .). Is that really the case when monoploid_seqs is not given, otherwise the mv will fail?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I need to add a test to verify this functionality.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now I am testing this behaviour here

and here

test("stub_with_monoploid_seqs") {

Thank you @mahesh-panchal

@GallVp
Copy link
Member Author

GallVp commented Feb 12, 2024

Hi @mahesh-panchal

I have added test data here: nf-core/test-datasets#1095. Kindly review the test data PR if possible for you. Many Thanks!

@GallVp GallVp removed the WIP Work in progress label Feb 22, 2024
@GallVp GallVp marked this pull request as ready for review February 22, 2024 22:39
@GallVp GallVp added this pull request to the merge queue Feb 25, 2024
Merged via the queue into nf-core:master with commit 26c8a95 Feb 25, 2024
11 checks passed
@GallVp GallVp deleted the lai branch February 25, 2024 21:05
@GallVp GallVp mentioned this pull request Feb 25, 2024
4 tasks
jch-13 pushed a commit to jch-13/modules that referenced this pull request Mar 19, 2024
* Added a draft for LAI

* Added tests with larger local data

* Added ltrharvest to lai test

* Updated tests for lai

* Removed unnecessary args
jennylsmith pushed a commit to RSC-RP/modules that referenced this pull request Mar 20, 2024
* Added a draft for LAI

* Added tests with larger local data

* Added ltrharvest to lai test

* Updated tests for lai

* Removed unnecessary args
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants